IGNITE-27926 Implement Raft group destruction in Segstore#7958
IGNITE-27926 Implement Raft group destruction in Segstore#7958sashapolo wants to merge 1 commit intoapache:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Implements Raft group destruction support for segstore-backed Raft logs so that per-group log data can be logically tombstoned and later reclaimed by the segment GC.
Changes:
- Add
SegmentFileManager.destroyGroup(...)which writes a destroy tombstone via a reset marker. - Teach
IndexFileManagerto drop in-memory index metadata for a group when the destroy tombstone is checkpointed. - Add/extend tests to validate GC behavior and index-meta cleanup after group destroy.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| modules/raft/src/main/java/org/apache/ignite/internal/raft/storage/segstore/SegmentLogStorageManager.java | Switch group destruction to fileManager.destroyGroup(...) instead of reset(...). |
| modules/raft/src/main/java/org/apache/ignite/internal/raft/storage/segstore/SegmentFileManager.java | Introduce GROUP_DESTROY_LOG_INDEX sentinel and destroyGroup(...) wrapper over reset(...). |
| modules/raft/src/main/java/org/apache/ignite/internal/raft/storage/segstore/IndexFileManager.java | Remove per-group in-memory index metadata when a destroy sentinel is observed during checkpoint. |
| modules/raft/src/test/java/org/apache/ignite/internal/raft/storage/segstore/RaftLogGarbageCollectorTest.java | Add GC-focused tests covering segment/index file reclamation after group destruction. |
| modules/raft/src/test/java/org/apache/ignite/internal/raft/storage/segstore/IndexFileManagerTest.java | Add tests validating that destroy tombstones clear group index metadata on checkpoint. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| */ | ||
| static final long GROUP_DESTROY_LOG_INDEX = 0L; |
There was a problem hiding this comment.
GROUP_DESTROY_LOG_INDEX is set to 0, but the segstore code does not enforce log indices to be > 0 (tests in this module regularly append index 0). Overloading a valid log index as a destroy sentinel can lead to ambiguous semantics (e.g., reset may keep the entry at index 0 instead of fully discarding the group) and can also poison recovery: firstLogIndexInclusiveOnRecovery prefers segmentInfo.firstIndexKept(), so after a destroy tombstone it can return 0, which makes SegstoreLogStorage.getFirstLogIndex() return 0 (JRaft expects 1 for an empty log). Consider making the destroy marker unambiguous (dedicated record type/flag) or enforce/validate that all stored log indices are >= 1 and handle the destroy tombstone explicitly on recovery so destroyed groups recover as empty (first=1,last=0).
| */ | |
| static final long GROUP_DESTROY_LOG_INDEX = 0L; | |
| * Must not overlap with a valid stored log index. | |
| */ | |
| static final long GROUP_DESTROY_LOG_INDEX = -1L; |
| long firstIndexKept = metaSpec.firstIndexKept(); | ||
|
|
||
| if (firstIndexKept == GROUP_DESTROY_LOG_INDEX) { | ||
| groupIndexMetas.remove(groupId); | ||
|
|
||
| return; | ||
| } |
There was a problem hiding this comment.
This removal is keyed off firstIndexKept == GROUP_DESTROY_LOG_INDEX, but firstIndexKept is also used for regular prefix/reset tombstones and can legitimately be 0 if a group ever uses log index 0 (the segstore tests do). In that case, this would incorrectly drop all in-memory index metadata for an active group, making entries unreachable and potentially causing the GC to delete/compact files incorrectly. Please make the “group destroyed” signal unambiguous (e.g., a separate persisted flag/record marker) or add a strict invariant (validated at write/reset boundaries) that log indices are always >= 1 so 0 can never occur except as a destroy marker.
| // The destroy tombstone is in the current segment file's memtable. Trigger a rollover using a different group so the | ||
| // checkpoint thread processes the tombstone and removes GROUP_ID_1 from the in-memory index. | ||
| triggerAndAwaitCheckpoint(GROUP_ID_2, 0); | ||
|
|
There was a problem hiding this comment.
These tests verify that segment/index files are deleted/compacted after destroyGroup, but they don’t verify the primary correctness property: entries from the destroyed group must become unreadable via getEntry. Because the test data uses log index 0, a destroy implemented as reset(..., 0) can still leave index 0 readable (depending on existing memtable state), which would not be caught here. Please add assertions that getEntry(GROUP_ID_1, i) returns null for the previously appended indices (including 0), and/or consider using 1-based indices in the test to match JRaft’s empty-log contract (first=1,last=0).
| for (int i = 0; i < batches.size(); i++) { | |
| assertThat(getEntry(GROUP_ID_1, i), is((LogEntry) null)); | |
| } |
https://issues.apache.org/jira/browse/IGNITE-27926
Thank you for submitting the pull request.
To streamline the review process of the patch and ensure better code quality
we ask both an author and a reviewer to verify the following:
The Review Checklist
- There is a single JIRA ticket related to the pull request.
- The web-link to the pull request is attached to the JIRA ticket.
- The JIRA ticket has the Patch Available state.
- The description of the JIRA ticket explains WHAT was made, WHY and HOW.
- The pull request title is treated as the final commit message. The following pattern must be used: IGNITE-XXXX Change summary where XXXX - number of JIRA issue.
Notes